On the equivalence of probability measures

Let $\Omega$ be a set ($\omega$ denotes one of its elements) and $\mathcal{A}$ be a
$\sigma$-algebra of subsets of $\Omega$. If $(P,Q)$ is a pair of probability measures
on $\mathcal{A}$, we are interested in describing the relations which may exist
between $P$ and $Q$.

To begin, suppose that $P$ and $Q$ are only $\sigma$-finite and that $Q$ is a signed
measure.

$Q$ is absolutely continuous with respect to $P$ (notation: $Q \ll P$) if
$Q(A) = 0$ whenever $P(A) = 0$, and $Q$ is singular with respect to $P$
(notation: $Q \perp P$) if there exists a set $A$ in $\mathcal{A}$ such that
$Q(A) = P(\Omega \setminus A) = 0$ (for a signed measure $Q$, read $Q(A) = 0$ as
$|Q|(A) = 0$, where $|Q|$ is the total variation of $Q$). When $Q \ll P$, $Q$
has the following representation:

$$Q(A) = \int_A f(\omega)\,P(d\omega),$$

where $f$ is measurable with respect to $\mathcal{A}$ and unique up to a set of
$P$-measure zero. $f$ is called the Radon-Nikodym derivative of $Q$ with respect
to $P$ and usually one writes $dQ/dP$ for $f$.

The most general statement about the pair $(P,Q)$ which is available is the
Lebesgue decomposition theorem: $Q = Q_1 + Q_2$ uniquely, where

(a) $Q_1$ is a $\sigma$-finite signed measure and $Q_1 \ll P$;

(b) $Q_2$ is a $\sigma$-finite signed measure and $Q_2 \perp P$.

Examples  illustrating  these  concepts  are  easy  to  find  and  we  shall  give  two. 

Let: $\Omega = \mathbb{R}$, $\mathcal{A} = \mathcal{B}[\mathbb{R}]$ (the Borel sets of $\mathbb{R}$), Leb = Lebesgue measure.

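1. Let $P = \mathrm{Leb}$ and $Q$ = Gaussian measure with mean zero and variance 1, that is

$$Q(A) = \int_A \varphi(x)\,P(dx), \qquad \varphi(x) = (2\pi)^{-1/2} e^{-x^2/2}.$$

Then $Q \ll P$ and $dQ/dP = \varphi$. Conversely, since $\varphi > 0$,

$$P(A) = \int_A P(dx) = \int_A \varphi^{-1}(x)\,\varphi(x)\,P(dx) = \int_A \varphi^{-1}(x)\,Q(dx).$$

Then $P \ll Q$ and $dP/dQ = \varphi^{-1}$ (the reciprocal $1/\varphi$). When both
$P \ll Q$ and $Q \ll P$, $P$ and $Q$ are said to be mutually absolutely continuous
(notation: $P \equiv Q$).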
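The two identities of example 1 can be checked numerically on an interval. The following is only a minimal sketch: the midpoint rule, the helper names (`phi`, `integrate`, `gaussian_measure`) and the interval $[-1, 2]$ are illustrative assumptions, not part of the original text.

```python
import math


def phi(x: float) -> float:
    """Standard normal density, i.e. dQ/dP when P is Lebesgue measure."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(g, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint rule for the Lebesgue integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def gaussian_measure(a: float, b: float) -> float:
    """Q([a, b]) computed independently through the error function."""
    return 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))


a, b = -1.0, 2.0  # hypothetical test interval A = [a, b]

# Q(A) = integral over A of phi dP
q_from_density = integrate(phi, a, b)
q_exact = gaussian_measure(a, b)

# P(A) = integral over A of (1/phi) dQ = integral over A of (1/phi) * phi dx
p_from_inverse = integrate(lambda x: (1.0 / phi(x)) * phi(x), a, b)

print(q_from_density, q_exact, p_from_inverse)
```

The first two printed numbers should agree up to the quadrature error, and the third should be the length of the interval, which is what $dP/dQ = \varphi^{-1}$ asserts.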

2. Let $a < b$ and

$$u_{a,b}(x) = \begin{cases} \dfrac{1}{b-a}, & x \in [a,b], \\ 0, & x \in \mathbb{R} \setminus [a,b]. \end{cases}$$

Set

$$U_{a,b}(A) = \int_A u_{a,b}(x)\,P(dx), \qquad P = \mathrm{Leb}.$$

$U_{a,b}$ is called the uniform measure on $[a,b]$. $Q$ is as in 1, with Gaussian
density $\varphi$. One has

$$U_{a,b}(A) = \int_A u_{a,b}(x)\,\varphi^{-1}(x)\,Q(dx),$$

so that $U_{a,b} \ll Q$ and $dU_{a,b}/dQ = u_{a,b} \cdot \varphi^{-1}$.

Let $Q_1(A) = Q(A \cap [a,b])$ and $Q_2(A) = Q(A \cap \{\mathbb{R} \setminus [a,b]\})$.
Then, if $I_{[a,b]}(x)$ is one where $a \le x \le b$ and zero elsewhere,

$$Q_1(A) = \int_A I_{[a,b]}(x)\,Q(dx) = (b-a) \int_A I_{[a,b]}(x)\,\varphi(x)\,U_{a,b}(dx),$$

so that $Q_1 \ll U_{a,b}$ and $dQ_1/dU_{a,b} = (b-a)\,I_{[a,b]}(x)\,\varphi(x)$. Furthermore,
$Q_2([a,b]) = Q(\emptyset) = 0$ and $U_{a,b}(\mathbb{R} \setminus [a,b]) = 0$, so that
$Q_2 \perp U_{a,b}$. Since $Q = Q_1 + Q_2$, $(Q_1, Q_2)$ is the Lebesgue decomposition
of $Q$ with respect to $U_{a,b}$.
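The decomposition of the Gaussian measure with respect to the uniform measure on $[a,b]$ can be illustrated on intervals. This is a hedged sketch: it only handles sets $A$ that are intervals, and the helper names (`overlap`, `q1`, `q2`, `uniform`, `q1_from_density`) and the numerical values are assumptions made for the illustration.

```python
import math


def phi(x: float) -> float:
    """Standard normal density (density of Q with respect to Lebesgue measure)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def q(lo: float, hi: float) -> float:
    """Q of the interval (lo, hi)."""
    return cdf(hi) - cdf(lo)


def overlap(interval, a, b):
    """Intersection of an interval with [a, b], or None if it is negligible."""
    lo, hi = max(interval[0], a), min(interval[1], b)
    return (lo, hi) if lo < hi else None


def q1(interval, a, b):
    """Q1(A) = Q(A intersected with [a, b])."""
    part = overlap(interval, a, b)
    return q(*part) if part else 0.0


def q2(interval, a, b):
    """Q2(A) = Q(A intersected with the complement of [a, b])."""
    return q(*interval) - q1(interval, a, b)


def uniform(interval, a, b):
    """U_{a,b}(A): normalized length of A intersected with [a, b]."""
    part = overlap(interval, a, b)
    return (part[1] - part[0]) / (b - a) if part else 0.0


def q1_from_density(interval, a, b, n=20_000):
    """Integral over A of (b - a) * I_[a,b] * phi with respect to U_{a,b}."""
    part = overlap(interval, a, b)
    if part is None:
        return 0.0
    h = (part[1] - part[0]) / n
    # U_{a,b}(dx) = dx / (b - a) on [a, b]; midpoint rule on the overlap.
    return sum((b - a) * phi(part[0] + (i + 0.5) * h) * h / (b - a) for i in range(n))


a, b = 0.0, 1.0   # hypothetical endpoints of the uniform measure
A = (-0.5, 3.0)   # hypothetical test set

print(q1(A, a, b) + q2(A, a, b), q(*A))            # Q = Q1 + Q2 on A
print(q2((a, b), a, b), uniform((b, 10.0), a, b))  # both vanish: Q2 is singular to U
print(q1_from_density(A, a, b), q1(A, a, b))       # dQ1/dU = (b - a) I phi
```

The second printed line exhibits the set $[a,b]$ that carries $U_{a,b}$ and is null for $Q_2$, which is exactly the singularity statement.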

The problem of obtaining the Lebesgue decomposition is particularly
interesting and useful when $\Omega$ is a space of functions (or of classes of
functions) and $P$ and $Q$ are induced by stochastic processes. Typically $\Omega$ is
$C[0,T]$, the space of continuous functions on $[0,T]$, or $L_2[0,T]$, the space of
equivalence classes of almost surely equal functions on $[0,T]$ which are
square-integrable.

$\mathcal{A}$ is then the $\sigma$-algebra generated by the open sets of $\Omega$. Thus if
$X: \Omega \times [0,T] \to \mathbb{R}$ is a stochastic process with continuous paths, the map
$X: \Omega \to C[0,T]$ defined by $X(\omega) = \{X(\omega,t), 0 \le t \le T\}$ is a
measurable map, so that $P_X = P \circ X^{-1}$ is a probability measure on the Borel
sets of $C[0,T]$. More generally if $F$ is a linear space of functions $f$, $P_X$ is
defined by relations of the type:

$$P_X\{f: (\ell_1(f),\ldots,\ell_n(f)) \in B\} = P\{\omega: (\ell_1(X(\omega)),\ldots,\ell_n(X(\omega))) \in B\},$$

$\ell_1,\ldots,\ell_n$ being continuous linear functionals on $F$.

There are two major methods to obtain results about the absolute continuity
of measures $P_X$ and $P_Y$. The first consists in choosing $X$ and $Y$ with probability
laws of the same type (two Gaussian processes, two diffusions, etc.) and using
this type to characterize absolute continuity, or equivalence. The second
consists in choosing for $X$ a martingale (for example Brownian motion) and for
$Y$ a process of the form $V + X$, where $V$ is of bounded variation. The privileged
tool is then stochastic calculus. We shall consider below a problem in the
second category and motivate it from considerations which arise in statistical
communication theory. If $V$ is independent of $X$, absolute continuity can in
certain cases be obtained if absolute continuity is known to hold for $V$
non-random, without recourse to stochastic calculus.

In statistical communication theory one tries to transmit information
accurately over communication channels. The information to be transmitted is
called a signal ($S$) and impediments to accurate transmission are called noise
($N$). The theory is statistical in the sense that $S$ and $N$ are stochastic
processes (impediments can only be known "on average" and information, like the
words of a language, has a random component: only the frequency of appearance
of given words can be approximated). Communication channels operate as "black
boxes": their description from basic physical principles is usually either
impossible or too complicated to be of much use. The signal $S$ is the input to
the box and the output is some function of $S$ and $N$, say $f(S,N)$. We shall
consider here the case $f(S,N) = S + N$, which has proved useful. Most
communication systems are monitored continuously and the operator is given a
function $\{x(t), 0 \le t \le T\}$ which can be noise or a distorted signal. So, the
first aim is to establish whether $x$ is a realization of $N$, that is
$x(t) = N(\omega,t)$ for some fixed $\omega$, or whether $x$ is a realization of $S + N$,
that is $x(t) = N(\omega,t) + S(\omega,t)$.
If a decision can be reached with no possibility of error, the problem is said
to be singular: $F$, the space of paths for $N$ and $S + N$, can be partitioned into
two measurable disjoint subsets $F_1$ and $F_2$ such that $P_N(F_1) = 1$ and
$P_{S+N}(F_2) = 1$. If the requirement that a correct decision be always reached,
say in case of no signal, forces the user to ignore the data, the problem is
said to be non-singular: for every measurable subset $F_1$ of $F$, $P_N(F_1) = 1$
implies $P_{S+N}(F_1) = 1$. Singularity corresponds thus to the case
$P_{S+N} \perp P_N$ and non-singularity to the case $P_N \equiv P_{S+N}$. In the latter
case, a good decision procedure consists in looking at $dP_{S+N}/dP_N(x)$ and in
deciding that a signal is present when this is large (that such a procedure is
good follows from the Neyman-Pearson lemma of statistics). $dP_{S+N}/dP_N$ can
always be thought of as the ratio of two densities, say $p_{S+N}$ and $p_N$:

$$P_{S+N}(A) = \int_A \frac{dP_{S+N}/d[P_N + P_{S+N}]}{dP_N/d[P_N + P_{S+N}]}\,dP_N, \qquad p_{S+N} = \frac{dP_{S+N}}{d[P_N + P_{S+N}]}, \quad p_N = \frac{dP_N}{d[P_N + P_{S+N}]},$$

and the choice consists in deciding "no signal" if $c\,p_N(x) > p_{S+N}(x)$ and
"signal" if $c\,p_N(x) < p_{S+N}(x)$ (for some adequately chosen $c$).
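The decision rule above can be sketched in a finite-dimensional setting. The simplifications here are assumptions for illustration only and are not the model of the text: the observation is a finite vector of samples, the noise has independent $N(0, \sigma^2)$ coordinates, and the signal is a known deterministic vector `s`. Under these assumptions the likelihood ratio $p_{S+N}(x)/p_N(x)$ equals $\exp\{(\langle x, s\rangle - \tfrac12 \|s\|^2)/\sigma^2\}$.

```python
import math
from typing import Sequence


def gaussian_density(x: Sequence[float], mean: Sequence[float], sigma2: float) -> float:
    """Joint density at x of independent N(mean_i, sigma2) coordinates."""
    log_p = sum(
        -0.5 * (xi - mi) ** 2 / sigma2 - 0.5 * math.log(2.0 * math.pi * sigma2)
        for xi, mi in zip(x, mean)
    )
    return math.exp(log_p)


def log_likelihood_ratio(x: Sequence[float], s: Sequence[float], sigma2: float) -> float:
    """log(p_{S+N}(x) / p_N(x)) for a known signal s in white Gaussian noise."""
    inner = sum(xi * si for xi, si in zip(x, s))
    energy = sum(si * si for si in s)
    return (inner - 0.5 * energy) / sigma2


def decide(x: Sequence[float], s: Sequence[float], sigma2: float, c: float = 1.0) -> str:
    """Return "signal" when p_{S+N}(x) > c * p_N(x), and "no signal" otherwise."""
    return "signal" if log_likelihood_ratio(x, s, sigma2) > math.log(c) else "no signal"


s = [1.0, 2.0, -1.0]   # hypothetical known signal samples
sigma2 = 1.0           # hypothetical noise variance
x = [0.3, -0.1, 0.7]   # hypothetical observation

ratio_direct = gaussian_density(x, s, sigma2) / gaussian_density(x, [0.0] * len(s), sigma2)
ratio_formula = math.exp(log_likelihood_ratio(x, s, sigma2))
print(ratio_direct, ratio_formula, decide(x, s, sigma2))
```

Working with the logarithm of the ratio avoids overflow for long observation vectors; the threshold `c` plays the role of the constant $c$ in the text.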

A classical problem of the type just described has the form: $N$ is a
Gaussian process, but no frequency requirements are imposed on $S$. The results
which shall be stated have been obtained by C. R. Baker and A. F. Gualtierotti
(UT Electrical Engineering, Fall 1981).


The  model: 

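$(\Omega, \mathcal{A}, P)$ is the basic probability space specified by the experiment and
the information as time evolves is contained in
$\underline{\mathcal{A}} = \{\mathcal{A}_t, 0 \le t \le T\}$, where $\mathcal{A}_t$ is
a $\sigma$-algebra of subsets of $\Omega$, $\mathcal{A}_t \subseteq \mathcal{A}$, and
$s < t \Rightarrow \mathcal{A}_s \subseteq \mathcal{A}_t$.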

(A)  The  noise  N 

Let $B(\omega,t) = (\beta_1(\omega,t),\ldots,\beta_n(\omega,t))^t$, the processes $\beta_k$ being
independent, Gaussian with independent increments and with continuous variance
$b_k(t) = E\beta_k^2(\omega,t)$ ($E\beta_k(\omega,t) = 0$). $B$ is adapted to
$\underline{\mathcal{A}}$, that is $B(\cdot,t)$ is measurable with respect to
$\mathcal{A}_t$. The diagonal matrix with entries $b_k(t)$ is denoted $\Sigma_B(t)$. It
can be shown that $\beta_k$ is a continuous square integrable martingale with
increasing process $b_k$ (one can choose for $\beta_k$ the Wiener process $W_k$ and then
$b_k(t) = t$). $F(t,x)$ is a function defined on $[0,T] \times [0,T]$: it is
measurable and $F(t,x) = 0$ for $x > t$.
$$E \int_0^T \langle s(\omega,x), \Sigma_B(dx)\,s(\omega,x) \rangle < \infty$$


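Remark: Let

$$R_N(s,t) = E\,N(\cdot,s)N(\cdot,t) = \int_0^T \langle F(t,x), \Sigma_B(dx)\,F(s,x) \rangle.$$

With $R_N$, one can associate a Hilbert space of functions $H(R_N)$ in the
following way: on the vector space of functions $\sum_{k=1}^{n} \alpha_k R_N(\cdot,t_k)$,
define

$$\Big\langle \sum_{k=1}^{n} \alpha_k R_N(\cdot,t_k), \sum_{\ell=1}^{m} \beta_\ell R_N(\cdot,s_\ell) \Big\rangle_{H(R_N)} = \sum_{k=1}^{n} \sum_{\ell=1}^{m} \alpha_k \beta_\ell R_N(s_\ell,t_k).$$

$H(R_N)$ is the completion of this vector space for the norm $\|\cdot\|_{H(R_N)}$ and
is called the reproducing kernel Hilbert space of $R_N$. The name comes from the
relation

$$\Big\langle R_N(\cdot,x), \sum_{k=1}^{n} \alpha_k R_N(\cdot,t_k) \Big\rangle_{H(R_N)} = \sum_{k=1}^{n} \alpha_k R_N(x,t_k).$$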


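Since

$$\sum_{k=1}^{n} \sum_{\ell=1}^{m} \alpha_k \beta_\ell R_N(s_\ell,t_k) = \int_0^T \Big\langle \sum_{k=1}^{n} \alpha_k F(t_k,x), \Sigma_B(dx) \sum_{\ell=1}^{m} \beta_\ell F(s_\ell,x) \Big\rangle$$

and

$$\sum_{k=1}^{n} \alpha_k R_N(u,t_k) = \int_0^T \Big\langle F(u,x), \Sigma_B(dx) \sum_{k=1}^{n} \alpha_k F(t_k,x) \Big\rangle,$$

one has here a representation of $H(R_N)$:

$$H(R_N) = \Big\{ a(t) = \int_0^T \langle F(t,x), \Sigma_B(dx)\,A(x) \rangle, \ A \in \text{subspace of } L_2[\Sigma_B] \text{ generated by } F(t,\cdot) \Big\},$$

$$\langle a, a \rangle_{H(R_N)} = \int_0^T \langle A(x), \Sigma_B(dx)\,A(x) \rangle.$$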


One thus sees that the signal $S$ has been chosen to belong to the reproducing
kernel Hilbert space of the noise. This is no coincidence: in many instances
this is a necessary condition for equivalence.
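The inner product on finite combinations of kernel sections, and its integral representation, can be illustrated with a concrete kernel. The sketch below assumes the scalar kernel $R(s,t) = \min(s,t)$ (the covariance of a Wiener process, for which $F(t,x)$ is the indicator of $x \le t$ and the variance measure is Lebesgue measure); the function names and coefficients are hypothetical.

```python
def kernel(s: float, t: float) -> float:
    """Assumed covariance kernel R(s, t) = min(s, t)."""
    return min(s, t)


def rkhs_inner(alpha, t_pts, beta, s_pts, R=kernel) -> float:
    """Inner product of sum_k alpha_k R(., t_k) and sum_l beta_l R(., s_l)."""
    return sum(
        a * b * R(s, t)
        for a, t in zip(alpha, t_pts)
        for b, s in zip(beta, s_pts)
    )


def evaluate(alpha, t_pts, u: float, R=kernel) -> float:
    """Value at u of the function sum_k alpha_k R(., t_k)."""
    return sum(a * R(u, t) for a, t in zip(alpha, t_pts))


def step_norm_squared(alpha, t_pts) -> float:
    """Integral of A(x)^2 dx, where A(x) = sum_k alpha_k 1{x <= t_k} is a step function."""
    total, prev = 0.0, 0.0
    for cur in sorted(set(t_pts)):
        # On (prev, cur] the step function equals the sum of alpha_k with t_k >= cur.
        level = sum(a for a, t in zip(alpha, t_pts) if t >= cur)
        total += level * level * (cur - prev)
        prev = cur
    return total


alpha = [1.0, -2.0, 0.5]   # hypothetical coefficients
t_pts = [0.2, 0.5, 0.9]    # hypothetical time points in [0, T]
x = 0.4

# Reproducing property: <R(., x), a> = a(x)
lhs = rkhs_inner([1.0], [x], alpha, t_pts)
rhs = evaluate(alpha, t_pts, x)

# Representation of the norm: <a, a> = integral of A(x)^2 dx
norm_kernel = rkhs_inner(alpha, t_pts, alpha, t_pts)
norm_integral = step_norm_squared(alpha, t_pts)
print(lhs, rhs, norm_kernel, norm_integral)
```

For this kernel the function $a(t) = \sum_k \alpha_k \min(t, t_k)$ is the integral from $0$ to $t$ of the step function $A$, so the agreement of the last two numbers is the scalar case of the norm formula of the Remark.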


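Example: Let $N = W$, a Wiener process on $[0,T]$. Then

$$H(R_W) = \Big\{ a(t) = \int_0^t \dot a(x)\,dx, \ \int_0^T \dot a^2(x)\,dx < \infty \Big\},$$

$$\langle a, b \rangle_{H(R_W)} = \int_0^T \dot a(x)\,\dot b(x)\,dx = \langle \dot a, \dot b \rangle_{L_2[0,T]}.$$

It can be shown that the process $W_a(\omega,t) = a(t) + W(\omega,t)$ has a law
$P_{W_a}$ equivalent to $P_W$, the law of $W$, if and only if $a \in H(R_W)$. More
generally, the law $P_Y$ of the process $Y$ is equivalent to $P_W$ "if and only if"
$Y$ has the form

$$Y(\omega,t) = \int_0^t \alpha(\omega,x)\,dx + W(\omega,t)$$

with

$$P\Big\{ \int_0^T \alpha^2(\omega,x)\,dx < \infty \Big\} = 1.$$

(The " " indicate that many details concerning $\alpha$ and $Y$ have been left out.)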
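The equivalence of the laws of $a + W$ and $W$ has an elementary counterpart on a time grid. As a sketch only: assume the path is observed through increments over steps of length `dt`, so that under $P_W$ the increments are independent $N(0, dt)$ and under the shifted law they are independent $N(\dot a_i\,dt, dt)$. The ratio of the two joint densities is then $\exp\{\sum_i \dot a_i\,\Delta x_i - \tfrac12 \sum_i \dot a_i^2\,dt\}$, the discrete analogue of the exponential density. All names and numbers below are illustrative assumptions.

```python
import math
from typing import Optional, Sequence


def increment_density(increments: Sequence[float], dt: float,
                      drift: Optional[Sequence[float]] = None) -> float:
    """Joint density of independent N(drift_i * dt, dt) increments."""
    if drift is None:
        drift = [0.0] * len(increments)
    log_p = 0.0
    for dx, a in zip(increments, drift):
        log_p += -0.5 * (dx - a * dt) ** 2 / dt - 0.5 * math.log(2.0 * math.pi * dt)
    return math.exp(log_p)


def discrete_density_ratio(increments: Sequence[float], drift: Sequence[float],
                           dt: float) -> float:
    """exp( sum_i a_i dx_i - (1/2) sum_i a_i^2 dt )."""
    stochastic_sum = sum(a * dx for a, dx in zip(drift, increments))
    energy = sum(a * a for a in drift) * dt
    return math.exp(stochastic_sum - 0.5 * energy)


dt = 0.25                             # hypothetical grid step
increments = [0.1, -0.2, 0.05, 0.3]   # hypothetical observed increments
drift = [1.0, 0.5, -1.0, 2.0]         # hypothetical values of the derivative of a

ratio_direct = increment_density(increments, dt, drift) / increment_density(increments, dt)
ratio_formula = discrete_density_ratio(increments, drift, dt)
print(ratio_direct, ratio_formula)
```

The term `energy` is the discrete version of $\int_0^T \dot a^2(x)\,dx$, the squared norm of $a$ in $H(R_W)$; the ratio is finite and positive exactly because this quantity is finite.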

We  now  have  the  following  results: 

I) There exists a measurable map $\mathcal{T}: C_n[0,T] \to L_2[0,T]$
($C_n[0,T] = \{f: [0,T] \to \mathbb{R}^n, f \text{ is continuous}\}$) such that
$N = \mathcal{T} \circ B$ and $S + N = \mathcal{T} \circ Z$, where

$$Z(\omega,t) = \int_0^t \Sigma_B(dx)\,s(\omega,x) + B(\omega,t),$$

provided $P_Z \ll P_B$.
II) If I) obtains, $P_{S+N} \ll P_N$ and

$$\frac{dP_{S+N}}{dP_N}(h) = \int_{C_n[0,T]} \frac{dP_Z}{dP_B}(\xi)\,P_{B|N=h}(d\xi),$$

where $P_{B|N=h}$ is the conditional law of $B$ knowing that $N = h$.


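III) $P_{B|N=h}$ is a point mass concentrated at $m(\cdot,h)$, with components

$$m_i(t,h) = \sum_{k=1}^{\infty} \frac{\langle h, e_k \rangle_{L_2[0,T]}\,\langle f_i(t,\cdot), e_k \rangle_{L_2[0,T]}}{\lambda_k},$$

where $(\lambda_k, e_k)$ is an eigenvalue-eigenvector couple for the operator $R_N$ on
$L_2[0,T]$ defined by $(R_N f)(t) = \int_0^T R_N(t,x) f(x)\,dx$. $N$ being mean square
continuous, the covariance function $R_N$ is continuous and thus the operator
$R_N$ has finite trace:

$$R_N = \sum_k \lambda_k\, e_k \otimes e_k \qquad ([a \otimes b](x) = \langle x, b \rangle_{L_2[0,T]}\, a).$$

The function $f_i(t,\cdot)$ is given by $f_i(t,x) = \int_0^t F_i(x,u)\,b_i(du)$. Hence,
$P_N$-almost surely,

$$\frac{dP_{S+N}}{dP_N}(h) = \frac{dP_Z}{dP_B}(m(\cdot,h)).$$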


One need thus define explicitly the function $dP_Z/dP_B$ on $C_n[0,T]$.
Let $\Pi(\xi,t) = \xi(t)$, $\xi \in C_n[0,T]$. It can be shown that if $Z$ is the
solution of the stochastic differential equation

$$Z(\omega,t) = \int_0^t \Sigma_B(dx)\,\alpha(Z(\omega,\cdot),x) + B(\omega,t),$$

where $\alpha(Z(\omega,\cdot),x)$ is a function of $Z(\omega,u)$, for $u \le x$, then,
$P_B$-almost surely,

$$\frac{dP_Z}{dP_B}(\xi) = \exp\Big\{ \int_0^T \langle \alpha(\xi,x), \Pi(\xi,dx) \rangle - \frac{1}{2} \int_0^T \langle \alpha(\xi,x), \Sigma_B(dx)\,\alpha(\xi,x) \rangle \Big\}.$$


The term $\int_0^T \langle \alpha(\xi,x), \Pi(\xi,dx) \rangle$ is a stochastic integral with
respect to the process $\{\Pi(\cdot,t), t \in [0,T]\}$ defined on the probability space
$(C_n[0,T], \mathcal{B}(C_n[0,T]), P_B)$.

Since $P_B$ and $P_{B|N=h}$ are orthogonal, one cannot simply evaluate the
exponential containing the stochastic integral at the point $m(\cdot,h)$. However,
one can approximate $dP_Z/dP_B(m(\cdot,h))$ in the $L_1[P_N]$ sense, replacing
$dP_Z/dP_B$ by any $L_1[P_B]$-approximation, which is useful for practical
applications.

The explicit representation of $dP_Z/dP_B$ is due to two facts. One is
Girsanov's theorem, which gives conditions under which the translation of a
martingale by a process of bounded variation is again a martingale with the
same local characteristics (in fact, in its generality, the theorem says that
the class of semimartingales, that is sums of martingales and processes of
bounded variation, is invariant under absolutely continuous changes of
probability measures). The other is the characterization of Brownian motion as
a continuous martingale. Since these two facts are valid for example for
processes with independent increments, the results just stated are valid for
processes other than the Gaussian ones. Finally, it is also possible to have
$B$ with an infinite number of components.


